81 research outputs found

    Automated mask generation for PIV image analysis based on pixel intensity statistics


    CaloriNet: From silhouettes to calorie estimation in private environments

    We propose a novel deep fusion architecture, CaloriNet, for the online estimation of energy expenditure for free-living monitoring in private environments, where RGB data is discarded and replaced by silhouettes. Our fused convolutional neural network architecture is trainable end-to-end to estimate calorie expenditure, using temporal foreground silhouettes alongside accelerometer data. The network is trained and cross-validated on a publicly available dataset, SPHERE_RGBD + Inertial_calorie. Results show state-of-the-art minimum error on the estimation of energy expenditure (calories per minute), outperforming alternative, standard and single-modal techniques. Comment: 11 pages, 7 figures.
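The silhouette-plus-accelerometer fusion described above can be illustrated with a minimal numpy sketch. The branch functions and weights below are illustrative stand-ins, not CaloriNet's actual CNN architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature extractors (stand-ins for learned CNN branches):
# each maps its raw modality to a fixed-length embedding.
def silhouette_branch(frames):           # frames: (T, H, W) binary silhouettes
    return frames.reshape(frames.shape[0], -1).mean(axis=0)   # (H*W,)

def accel_branch(accel):                 # accel: (T, 3) accelerometer samples
    return np.concatenate([accel.mean(axis=0), accel.std(axis=0)])  # (6,)

# Late fusion: concatenate both embeddings, then a linear head predicts cal/min.
frames = (rng.random((30, 8, 8)) > 0.5).astype(float)
accel  = rng.normal(size=(30, 3))
fused  = np.concatenate([silhouette_branch(frames), accel_branch(accel)])

W = rng.normal(scale=0.1, size=fused.shape[0])   # hypothetical trained weights
calories_per_minute = float(W @ fused)
```

A real system would learn both branches and the head jointly, end-to-end, as the abstract states.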

    Person Re-ID by Fusion of Video Silhouettes and Wearable Signals for Home Monitoring Applications

    The use of visual sensors for monitoring people in their living environments is critical for obtaining more accurate health measurements, but it is undermined by the issue of privacy. Silhouettes, generated from RGB video, can considerably alleviate the privacy issue. However, silhouettes make it difficult to discriminate between different subjects, preventing a subject-tailored analysis of the data within a free-living, multi-occupancy home. This limitation can be overcome with a strategic fusion of sensors involving wearable accelerometer devices, which can be used in conjunction with the silhouette video data to match video clips to a specific patient being monitored. The proposed method simultaneously solves the problem of Person ReID using silhouettes and enables home monitoring systems to employ sensor fusion techniques for data analysis. We develop a multimodal deep-learning detection framework that maps short video clips and accelerations into a latent space where the Euclidean distance can be measured to match video and acceleration streams. We train our method on the SPHERE Calorie Dataset, for which we show an average area under the ROC curve of 76.3% and an assignment accuracy of 77.4%. In addition, we propose a novel triplet loss which we demonstrate improves performance and convergence speed.
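The matching objective can be sketched with a standard triplet loss over embedded streams; the paper proposes its own variant, and the toy vectors below are made up for illustration:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard hinge triplet loss on embedded samples (the paper's variant differs)."""
    d_pos = np.linalg.norm(anchor - positive)   # Euclidean distance to the match
    d_neg = np.linalg.norm(anchor - negative)   # distance to a different subject
    return max(0.0, d_pos - d_neg + margin)

# Hypothetical embeddings: a video clip and accelerations from the SAME person
# should sit close in the shared latent space; another person's stream, far away.
video_emb   = np.array([0.10, 0.90, 0.00])
accel_same  = np.array([0.15, 0.85, 0.05])   # positive
accel_other = np.array([0.90, 0.10, 0.60])   # negative

loss = triplet_loss(video_emb, accel_same, accel_other)
```

Here the positive pair is already much closer than the negative, so the hinge is inactive and the loss is zero.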

    Meta-Learning with Context-Agnostic Initialisations

    Meta-learning approaches have addressed few-shot problems by finding initialisations suited for fine-tuning to target tasks. Often there are additional properties within training data (which we refer to as context), not relevant to the target task, which act as a distractor to meta-learning, particularly when the target task contains examples from a novel context not seen during training. We address this oversight by incorporating a context-adversarial component into the meta-learning process. This produces an initialisation for fine-tuning to target which is both context-agnostic and task-generalised. We evaluate our approach on three commonly used meta-learning algorithms and two problems. We demonstrate that our context-agnostic meta-learning improves results in each case. First, we report on Omniglot few-shot character classification, using alphabets as context. An average improvement of 4.3% is observed across methods and tasks when classifying characters from an unseen alphabet. Second, we evaluate on a dataset for personalised energy expenditure predictions from video, using participant knowledge as context. We demonstrate that context-agnostic meta-learning decreases the average mean square error by 30%.
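One common way to realise an adversarial component like this is a gradient reversal layer, as in domain-adversarial training: the forward pass is the identity, but the backward pass flips the gradient sign so the feature extractor learns to *defeat* a context classifier. Whether this matches the paper's exact mechanism is an assumption; a minimal sketch:

```python
import numpy as np

# Gradient reversal layer: identity forward, sign-flipped gradient backward.
def grl_forward(x):
    return x

def grl_backward(grad, lam=1.0):
    # lam trades off the adversarial (context-removal) signal against the task loss.
    return -lam * grad

x = np.array([1.0, -2.0, 3.0])
g = np.array([0.5, 0.5, 0.5])     # gradient flowing back from the context classifier

y = grl_forward(x)                # unchanged activations
g_rev = grl_backward(g)           # reversed gradient: [-0.5 -0.5 -0.5]
```

Placed between the feature extractor and a context classifier (alphabet or participant ID), this drives features toward being uninformative about context while the main meta-learning objective is optimised as usual.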

    Temporal-Relational CrossTransformers for Few-Shot Action Recognition

    We propose a novel approach to few-shot action recognition, finding temporally-corresponding frame tuples between the query and videos in the support set. Distinct from previous few-shot works, we construct class prototypes using the CrossTransformer attention mechanism to observe relevant sub-sequences of all support videos, rather than using class averages or single best matches. Video representations are formed from ordered tuples of varying numbers of frames, which allows sub-sequences of actions at different speeds and temporal offsets to be compared. Our proposed Temporal-Relational CrossTransformers (TRX) achieve state-of-the-art results on few-shot splits of Kinetics, Something-Something V2 (SSv2), HMDB51 and UCF101. Importantly, our method outperforms prior work on SSv2 by a wide margin (12%) due to its ability to model temporal relations. A detailed ablation showcases the importance of matching to multiple support set videos and learning higher-order relational CrossTransformers. Comment: Accepted at CVPR 2021.
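The ordered frame tuples of varying cardinality can be enumerated as increasing index combinations over the sampled frames; this sketch assumes exhaustive enumeration, whereas an implementation may sample a subset:

```python
from itertools import combinations

# All ordered (strictly increasing) frame-index tuples of a given cardinality.
# Pairs, triples, etc. capture sub-sequences at different speeds and offsets.
def frame_tuples(num_frames, cardinality):
    return list(combinations(range(num_frames), cardinality))

pairs = frame_tuples(4, 2)
# [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
```

Each tuple's frame features are then compared against support-set tuples of the same cardinality via the CrossTransformer attention described above.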

    Design of Antibody-Functionalized Polymeric Membranes for the Immunoisolation of Pancreatic Islets

    An immunoencapsulation strategy for pancreatic islets, aimed at reducing the risk of rejection in transplanted patients due to the immune response of the host organism, is proposed. In this sense, a polyethylene glycol (PEG) hydrogel functionalized with an immunosuppressive antibody (Ab), such as Cytotoxic T-lymphocyte antigen-4 Ig (CTLA4-Ig), would act as both a passive and an active barrier to the host immune response. To demonstrate the feasibility of this approach, a photopolymerizable PEG was conjugated to the selected antibody and the PEG-Ab complex was used to coat the islets. Moreover, to preserve the antigen-recognition site of the antibody during the conjugation process, a controlled immobilization method was set up through the attachment of the His-tagged antigen to a solid support. In detail, a gold-coated silicon wafer functionalized with 11-mercaptoundecanoic acid was used as a substrate for further modification, leading to a nickel(II)-terminated ligand surface. Then, the immobilized antigen was recognized by the corresponding antibody that was conjugated to the PEG. The antibody-PEG complex was detached from the support prior to being photopolymerized around the islets. First, this immobilization method was demonstrated for the green fluorescent protein (GFP)–anti-GFP antigen-antibody pair as a proof of principle. Then, the approach was extended to the immunorelevant B7-1 CTLA-4-Ig antigen-antibody pair, followed by the binding of Acryl-PEG to the immobilized constant region of the antibody. In both cases, after using an elution protocol, only a partial recovery of the antibody-PEG complex was obtained. Nevertheless, the viability and the functional activity of the encapsulated islets, as determined by the glucose-stimulated insulin secretion (GSIS) assay, showed the good compatibility of this approach.

    Multimodal Classification of Parkinson's Disease in Home Environments with Resiliency to Missing Modalities

    Parkinson’s disease (PD) is a chronic neurodegenerative condition that affects a patient’s everyday life. Authors have proposed that a machine learning and sensor-based approach that continuously monitors patients in naturalistic settings can provide constant evaluation of PD and objectively analyse its progression. In this paper, we make progress toward such PD evaluation by presenting a multimodal deep learning approach for discriminating between people with PD and without PD. Specifically, our proposed architecture, named MCPD-Net, uses two data modalities, acquired from vision and accelerometer sensors in a home environment, to train variational autoencoder (VAE) models. These are modality-specific VAEs that predict effective representations of human movements to be fused and given to a classification module. During our end-to-end training, we minimise the difference between the latent spaces corresponding to the two data modalities. This makes our method capable of dealing with missing modalities during inference. We show that our proposed multimodal method outperforms unimodal and other multimodal approaches by an average increase in F1-score of 0.25 and 0.09, respectively, on a data set with real patients. We also show that our method still outperforms other approaches by an average increase in F1-score of 0.17 when a modality is missing during inference, demonstrating the benefit of training on multiple modalities.
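The latent-space alignment between the two modality-specific VAEs can be sketched as a simple mean-squared discrepancy between the two latent codes; the exact discrepancy measure used in the paper is an assumption here:

```python
import numpy as np

# Alignment term: penalise disagreement between the vision and accelerometer
# latent codes so that either code can stand in for the other when one
# modality is missing at inference time.
def alignment_loss(z_vision, z_accel):
    return float(np.mean((z_vision - z_accel) ** 2))

z_vision = np.array([0.2, 0.8, -0.1])   # hypothetical vision VAE code
z_accel  = np.array([0.2, 0.8, -0.1])   # hypothetical accelerometer VAE code

perfectly_aligned = alignment_loss(z_vision, z_accel)   # 0.0 for identical codes
misaligned = alignment_loss(np.zeros(3), np.ones(3))    # grows with disagreement
```

Minimising this term alongside the VAE and classification losses is what lets the classifier operate on whichever latent code is available at inference.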